

Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation

Neural Information Processing Systems

To this end, we propose a simple yet powerful paradigm for seamlessly unifying different human pose and shape-related tasks and datasets. Our formulation is centered on the ability, both at training and test time, to query any point of the human volume and obtain its estimated location in 3D. We achieve this by learning a continuous neural field of body point localizer functions, each of which is a differently parameterized 3D heatmap-based convolutional point localizer (detector).
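To make the query mechanism concrete, here is a minimal sketch of one way such a localizer field could be realized: a small hypernetwork maps a canonical body point to the weights of a per-point 1x1x1 convolutional head, which produces a 3D heatmap over an image-derived feature volume and is reduced to a coordinate by soft-argmax. All module names, dimensions, and the hypernetwork design are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class LocalizerField(nn.Module):
    """Hedged sketch: canonical body point -> parameters of a small
    convolutional point localizer applied to a 3D feature volume."""
    def __init__(self, feat_dim=128, hidden=256):
        super().__init__()
        # Hypernetwork: canonical point (x, y, z) -> 1x1x1 conv weight + bias.
        self.hyper = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim + 1),
        )

    def forward(self, points, feat_volume):
        # points: (B, N, 3) canonical query points
        # feat_volume: (B, C, D, H, W) image-derived feature volume
        B, N, _ = points.shape
        w = self.hyper(points)                    # (B, N, C+1)
        weight, bias = w[..., :-1], w[..., -1:]   # per-point conv parameters
        # Apply each point's 1x1x1 conv as a channel contraction.
        heat = torch.einsum('bnc,bcdhw->bndhw', weight, feat_volume)
        heat = heat + bias[..., None, None]
        heat = heat.flatten(2).softmax(-1).view(B, N, *feat_volume.shape[2:])
        # Soft-argmax: expected 3D coordinate under the normalized heatmap.
        D, H, W = feat_volume.shape[2:]
        zz, yy, xx = torch.meshgrid(
            torch.linspace(-1, 1, D), torch.linspace(-1, 1, H),
            torch.linspace(-1, 1, W), indexing='ij')
        grid = torch.stack([xx, yy, zz], -1).to(heat)        # (D, H, W, 3)
        return torch.einsum('bndhw,dhwk->bnk', heat, grid)   # (B, N, 3)

Because the field is continuous in the query point, the same module can be evaluated at any surface or interior point of the body at training or test time.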


MIMIC-MJX: Neuromechanical Emulation of Animal Behavior

Zhang, Charles Y., Yang, Yuanjia, Sirbu, Aidan, Abe, Elliott T. T., Wärnberg, Emil, Leonardis, Eric J., Aldarondo, Diego E., Lee, Adam, Prasad, Aaditya, Foat, Jason, Bian, Kaiwen, Park, Joshua, Bhatt, Rusham, Saunders, Hutton, Nagamori, Akira, Thanawalla, Ayesha R., Huang, Kee Wui, Plum, Fabian, Beck, Hendrik K., Flavell, Steven W., Labonte, David, Richards, Blake A., Brunton, Bingni W., Azim, Eiman, Ölveczky, Bence P., Pereira, Talmo D.

arXiv.org Artificial Intelligence

The primary output of the nervous system is movement and behavior. While recent advances have democratized pose tracking during complex behavior, kinematic trajectories alone provide only indirect access to the underlying control processes. Here we present MIMIC-MJX, a framework for learning biologically plausible neural control policies from kinematics. MIMIC-MJX models the generative process of motor control by training neural controllers that learn to actuate biomechanically realistic body models in physics simulation to reproduce real kinematic trajectories. We demonstrate that our implementation is accurate, fast, data-efficient, and generalizable to diverse animal body models. Policies trained with MIMIC-MJX can be used both to analyze neural control strategies and to simulate behavioral experiments, illustrating its potential as an integrative modeling framework for neuroscience.
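As an illustration of the emulation loop this describes, below is a hedged sketch of a kinematic-tracking rollout in MuJoCo MJX. The body model file, observation layout, policy interface, and exponential tracking reward are assumptions made for exposition; the actual MIMIC-MJX training procedure is not shown.

# Minimal sketch of a kinematic-tracking rollout in MuJoCo MJX (JAX).
import jax
import jax.numpy as jnp
import mujoco
from mujoco import mjx

mj_model = mujoco.MjModel.from_xml_path('body_model.xml')  # hypothetical model file
model = mjx.put_model(mj_model)

def rollout(policy_params, policy_fn, init_data, ref_qpos):
    """Roll a neural controller forward in physics and score how closely
    the simulated pose tracks a reference kinematic trajectory (T, nq)."""
    def step(data, ref_t):
        obs = jnp.concatenate([data.qpos, data.qvel, ref_t])
        ctrl = policy_fn(policy_params, obs)           # actuator/muscle commands
        data = mjx.step(model, data.replace(ctrl=ctrl))
        # Assumed reward: exponentiated negative pose-tracking error.
        reward = jnp.exp(-jnp.sum((data.qpos - ref_t) ** 2))
        return data, reward

    _, rewards = jax.lax.scan(step, init_data, ref_qpos)
    return rewards.mean()

init_data = mjx.make_data(model)

A controller would then be optimized to maximize the mean tracking reward, for example with reinforcement learning; that outer optimization is omitted here.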


Dream, Lift, Animate: From Single Images to Animatable Gaussian Avatars

Bühler, Marcel C., Yuan, Ye, Li, Xueting, Huang, Yangyi, Nagano, Koki, Iqbal, Umar

arXiv.org Artificial Intelligence

We introduce Dream, Lift, Animate (DLA), a novel framework that reconstructs animatable 3D human avatars from a single image. This is achieved by leveraging multi-view generation, 3D Gaussian lifting, and pose-aware UV-space mapping of 3D Gaussians. Given an image, we first dream plausible multi-views using a video diffusion model, capturing rich geometric and appearance details. These views are then lifted into unstructured 3D Gaussians. To enable animation, we propose a transformer-based encoder that models global spatial relationships and projects these Gaussians into a structured latent representation aligned with the UV space of a parametric body model. This latent code is decoded into UV-space Gaussians that can be animated via body-driven deformation and rendered conditioned on pose and viewpoint. By anchoring Gaussians to the UV manifold, our method ensures consistency during animation while preserving fine visual details. DLA enables real-time rendering and intuitive editing without requiring post-processing. Our method outperforms state-of-the-art approaches on the ActorsHQ and 4D-Dress datasets in both perceptual quality and photometric accuracy. By combining the generative strengths of video diffusion models with a pose-aware UV-space Gaussian mapping, DLA bridges the gap between unstructured 3D representations and high-fidelity, animation-ready avatars.
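The phrase "animated via body-driven deformation" suggests that each UV-anchored Gaussian moves with the parametric body surface. A standard way to instantiate this is linear blend skinning with weights sampled from the body model's UV map; the sketch below shows only that step, with hypothetical tensor names and shapes rather than the authors' implementation.

import torch

def deform_uv_gaussians(canon_xyz, skin_w, bone_T):
    """
    canon_xyz: (N, 3)    canonical Gaussian centers, one per UV texel
    skin_w:    (N, J)    skinning weights sampled from the body's UV map
    bone_T:    (J, 4, 4) posed bone transforms from the parametric body
    Returns posed Gaussian centers via linear blend skinning; the same
    blended transform would also rotate each Gaussian's covariance.
    """
    # Homogeneous coordinates for the canonical centers.
    xyz_h = torch.cat([canon_xyz, torch.ones_like(canon_xyz[:, :1])], dim=-1)
    # Per-Gaussian blended transform, then apply it.
    T = torch.einsum('nj,jab->nab', skin_w, bone_T)
    return torch.einsum('nab,nb->na', T, xyz_h)[:, :3]

Because the Gaussians are indexed by fixed UV texels, their identity and appearance persist across poses, which is what keeps the avatar consistent during animation.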


Appendix A.1: Dataset Information

Neural Information Processing Systems

Currently, the dataset can be downloaded under this link (2.2 GB, compressed tar file). The Muscles in Time dataset will be published under a CC BY-NC 4.0 license, and our data generation pipeline is licensed under the Apache License Version 2.0. Data structure: the structure of the provided MinT data is intentionally kept simple. The first and last 0.14 seconds of each sequence are cut off, since the muscle activation estimates are unreliable at the sequence boundaries. A short example of musint package usage is displayed in Listing 2. In Figure 9 we provide additional information on the data provided with Muscles in Time. Total Capture makes up a small part of the dataset, with exceptionally long sequences; the largest source dataset contributes 3.2 h of analyzed recordings. The muscle-driven simulation is based on the approach by Falisse et al.
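As a small illustration of the boundary trim mentioned above (and not the musint package API, which is shown in Listing 2 of the paper), the following sketch drops the first and last 0.14 s of an activation time series, assuming a known sampling rate:

import numpy as np

def trim_boundaries(activations: np.ndarray, fps: float, margin_s: float = 0.14):
    """activations: (T, n_muscles) time series sampled at `fps` Hz.
    Removes the unreliable margin at both ends of the sequence."""
    k = int(round(margin_s * fps))
    return activations[k:len(activations) - k]

# Example: at 50 Hz, a 0.14 s margin removes 7 frames from each end.
trimmed = trim_boundaries(np.zeros((500, 40)), fps=50.0)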